ABSTRACT



Feedback is introduced between a video CODEC and 
the intended communications channel such that the 
characteristics of the channel are used to drive multiple 
video output buffers. These multiple output buffers 
share an original temporal video reference, but have 
different subsequent temporal video images. The com- 
munications channel interface then picks the subsequent 
video image buffer that best matches the current condi- 
tions experienced by it. By using a predictor of the 
channel performance, the video algorithm can be tuned 
to provide video output buffers with the best guess of 
how the buffers should be configured. A number of 
subsequent histories of an image are buffered until the 
receiving channel indicates it is ready to receive the 
next. Then the appropriate output buffer having the 
corresponding temporal change in the video is used to 
supply the next frame change information to the receiv- 
ing station. 

1 Claim, 5 Drawing Sheets 
MULTIPLE ENCODER OUTPUT BUFFER
APPARATUS FOR DIFFERENTIAL CODING OF 
VIDEO INFORMATION 

BACKGROUND OF THE INVENTION 

1. Field of the Invention 

The present invention relates to the transmission of 
information over a communications path. More particularly,
the present invention relates to the communication of
high-bandwidth information over networks of varying types.

2. Art Background 

Until recently, telecommunications and computing 
were considered to be entirely separate disciplines. 
Telecommunications was analog and done in real time 
whereas computing was digital and performed at a rate 
determined by the processing speed of a computer. 
Today, such technologies as speech processing, elec- 
tronic mail and facsimile have blurred these lines. In the 
coming years, computing and telecommunications will 
become almost indistinguishable in a race to support a 
broad range of new multimedia (i.e., voice, video and 
data) applications. These applications are made possible 
by emerging digital-processing technologies, which 
include: compressed audio (both high fidelity audio and 
speech), high resolution still images, and compressed 
video. The emerging technologies will allow for collab- 
oration at a distance, including video conferencing. 

Of these technologies, video is particularly exciting in 
terms of its potential applications. But video is also the 
most demanding in terms of processing power and sheer 
volume of data to be processed. Uncompressed digital 
video requires somewhere between 50 and 200 Mb/s 
(megabits per second) to support the real-time transmis- 
sion of standard television quality images. This makes 
impractical the widespread use of uncompressed digital 
video in telecommunications applications. 

Fortunately, there is considerable redundancy in 
video data, both in terms of information theory and 
human perception. This redundancy allows for the 
compression of digital video sequences into lower trans- 
mission rates. For some time, researchers have been 
aware of a variety of techniques that can be used to 
compress video data sequences anywhere from 2:1 to 
1000:1, depending on the quality required by the appli- 
cation. Until recently, however, it was not practical to 
incorporate these techniques into low cost video-based 
applications. 

A number of standards have been recently developed 
for such activities as video conferencing, the transmis- 
sion and storage of standard high quality still images, as 
well as standards for interactive video playback to pro- 
vide interoperability between numerous communica- 
tions points. The standards recognize a need for quality 
video compression to reduce the tremendous amount of 
data required for the transmission of video information. 

Two important methods of data compression for 
video information are used widely throughout the vari- 
ous standards for video communication. These are the 
concepts of frame differencing and motion compensa- 
tion. Frame differencing recognizes that a normal video 
sequence has little variation from one frame to the next. 
If, instead of coding each frame, only the differences 
between a frame and the previous frame are coded, then 
the amount of information needed to describe the new 
frame will be dramatically reduced. Motion compensa- 
tion recognizes that much of the difference that does 
occur between successive frames can be characterized
as a simple translation of motion, caused either by the
moving of objects in the scene or by a pan of the field of
view. Rather than form a simple difference between
blocks in a current frame and the same block in the
previous frame, the area around those blocks can be
searched in the previous frame to find an offset block
that more closely matches the block of the current
frame. Once a best match has been identified, the
difference between a reference block in the current frame and
the best match in the previous frame are coded to
produce a vector that describes the offset of the best match.
This motion vector then can be used with the previous
frame to produce the equivalent of what the current
frame should be. These methods, and others, are
incorporated into systems which make possible the rapid
transmission of real-time video information.
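
The two techniques described above can be sketched in a few lines. The following is a minimal illustration only, assuming small grayscale frames stored as lists of rows, a sum-of-absolute-differences matching cost, and an exhaustive search window; the function names, block size and search range are hypothetical choices and are not taken from any standard.

```python
def frame_difference(prev, curr):
    """Per-pixel difference between two equally sized grayscale frames."""
    return [[c - p for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]


def sad(prev, curr, bx, by, dx, dy, size):
    """Sum of absolute differences between the block at (bx, by) in the
    current frame and the block displaced by (dx, dy) in the previous frame."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(curr[by + y][bx + x] - prev[by + dy + y][bx + dx + x])
    return total


def best_motion_vector(prev, curr, bx, by, size=2, search=1):
    """Search the area around (bx, by) in the previous frame for the offset
    block that most closely matches the current block.

    Returns ((dx, dy), cost). The zero offset is the starting candidate, so
    it wins unless another offset is strictly better.
    """
    height, width = len(prev), len(prev[0])
    best, best_cost = (0, 0), sad(prev, curr, bx, by, 0, 0, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidate blocks that fall outside the previous frame.
            if not (0 <= bx + dx and bx + dx + size <= width
                    and 0 <= by + dy and by + dy + size <= height):
                continue
            cost = sad(prev, curr, bx, by, dx, dy, size)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost
```

When an object simply shifts between frames, the search finds an offset with a small residual, so only the vector and that residual need to be coded rather than the full block difference.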

As the worlds of telecommunications and computers
blend closely together, the telecommunications aspects
of communications will have to contend with some of
the constraints of the computer world. Particularly,
video conferencing over existing computer networks
will prove a challenge in that maintaining real time
information communication over traffic-burdened
existing network protocols may prove insurmountable.

Current video algorithms assume a nearly constant
bandwidth availability for the encoding of video
information. This is evidenced by the use of only a single
output buffer for traditional video encoder output. It is
common to use the output buffer fullness as a feedback
parameter for encoding subsequent images; i.e., with
higher or lower levels of quantization. A well-known
effect resulting from using a single output buffer is
called "bit-bang" where the output buffer is over-depleted
by the interface to the communications channel,
causing the feedback loop to indicate that the buffer can
handle lots of data, which in turn causes the video
compression algorithm to under-optimize the subsequent
image coding. The user perceives the bit-bang as an
uneven quality and frame rate.
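
The feedback loop described above can be illustrated with a toy model. This is a sketch under stated assumptions, not a description of any particular encoder: the linear mapping from buffer fullness to quantizer step, the inverse relation between step and frame size, and every constant and name below (`quantizer_step`, `simulate`, `frame_bits_at_step1`) are hypothetical.

```python
def quantizer_step(fullness, min_step=2, max_step=32):
    """Map output buffer fullness in [0, 1] to a quantizer step.

    A fuller buffer yields a coarser step (fewer bits per frame).
    The linear mapping is an assumption made for illustration.
    """
    fullness = min(max(fullness, 0.0), 1.0)
    return round(min_step + fullness * (max_step - min_step))


def simulate(drains, capacity=1000, frame_bits_at_step1=3200):
    """Simulate a single output buffer driven by channel drains.

    For each interval the channel removes `drain` bits, the encoder picks a
    quantizer step from the resulting fullness, and the encoded frame
    (assumed to shrink in proportion to the step) is added to the buffer.
    Returns a list of (step, frame_bits, occupancy) tuples.
    """
    occupancy = capacity // 2
    history = []
    for drain in drains:
        occupancy = max(0, occupancy - drain)
        step = quantizer_step(occupancy / capacity)
        bits = frame_bits_at_step1 // step
        occupancy = min(capacity, occupancy + bits)
        history.append((step, bits, occupancy))
    return history
```

In this model a steady drain such as `simulate([100, 100, 100])` keeps the step within a narrow range, while a burst followed by a stall such as `simulate([500, 0, 0])` empties the buffer, triggers a very fine step and a large frame, and then forces a very coarse step on the next frame, which corresponds to the uneven quality described above.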

To alleviate bit-bang, the typical approach has been
to limit the amount of data pulled out from the encoder
video output buffer to a fraction of the total size of the
output buffer; 10% to 30% is typical. This approach
keeps the feedback indicator rather small, and encoding
more uniform. The underlying assumption of this
approach is that the communications channel will usually
not be changing rapidly. Exceptions are caused by
connectivity interruptions, such as burst errors, which are
handled strictly as exceptions to the call. In a local area
network (LAN), or other collision-sensing multiple
access channel, or in other networks with burst
characteristics (such as noisy RF channels), this underlying
assumption no longer holds. Over these sorts of
communications channels, unanticipated transmission delays
may result in bit-bang problems which are not so readily
overcome by limiting the size of the feedback buffer.
Thus, video jerkiness will result in real-time video
communication over such channels. It would be
advantageous, and is therefore an object of the present
invention, to provide a video transmission mechanism which
can be accommodated on such potentially bursty networks.

SUMMARY OF THE INVENTION

From the foregoing, it can be appreciated that there is
a need for a mechanism of incorporating real-time video
data communication over traditional network protocols
to smooth video transmission. It is therefore an object of
the present invention to provide a method and apparatus
for the conveyance of video data over such networks as
local area networks.

These and other objects of the present invention are
provided by introducing feedback between the video
CODEC and the intended communications channel
such that the characteristics of the channel are used to
drive multiple video output buffers. These multiple
output buffers share an original temporal video reference,
but have different subsequent temporal video
images. The communications channel interface then
picks the subsequent video image buffer that best
matches the current conditions experienced by it. By
using a predictor of the channel performance, the video
algorithm can be tuned to provide video output buffers
with the best guess of how the buffers should be configured.
A number of subsequent histories of an image are
buffered until the receiving channel indicates it is ready
to receive the next. Then the appropriate output buffer
having the corresponding temporal change in the video
is used to supply the next frame change information to
the receiving station.

BRIEF DESCRIPTION OF THE DRAWINGS

The objects, features and advantages of the present
invention will be apparent from the following detailed
description in which:

FIG. 1 demonstrates a hypothetical network having a
plurality of video-capable nodes for interacting and
providing video conferencing capabilities.

FIG. 2 illustrates hardware to be utilized in implementing
the present invention in one embodiment.

FIG. 3 illustrates a logical rendition of a plurality of
output buffers with successive time interval video
information for one embodiment of the present invention.

FIG. 4 illustrates a branching tree structure corresponding
to successive temporal transmit reference
images for one embodiment of the present invention.

FIG. 5 illustrates alternative logical output buffer
uses for channel dependent data transmission over a
network.

FIG. 6 illustrates characteristics of audio information
which may be transmitted over a network in accordance
with another embodiment of the present invention.

FIG. 7 illustrates a generalized block diagram of the
present invention.

DETAILED DESCRIPTION OF THE INVENTION

A method and apparatus are described for the
conveyance of real-time isochronous data over bursty
networks. Although the present invention is described
predominantly in terms of the transmission of video
information, the concepts and method are broad enough
to encompass the transmission of real-time audio and
other data requiring isochronous data transfer.
Throughout this detailed description, numerous details
are specified such as bit rates and frame sizes, in order to
provide a thorough understanding of the present invention.
To one skilled in the art, however, it will be understood
that the present invention may be practiced without
such specific details. In other instances, well-known
control structures and gate level circuits have not been
shown in detail in order not to obscure unnecessarily
the present invention. Particularly, some functions are
described to be carried out by various logic circuits.
Those of ordinary skill in the art, having been described
the various functions will be able to implement the
necessary logic circuits without undue experimentation.

FIG. 1 is used to illustrate a simple network having a
plurality of video-capable nodes. The network is
illustrated as a simple star network 10 having a centrally
incorporated multi-point control unit (MCU). The network
is presented as having five (5) nodes 11, 12, 13, 14
and 15. For the purposes of explanation, these will all be
considered video-capable nodes, with nodes 12 and 13
supporting IRV video (160 pixels × 120 lines) while
nodes 11, 14 and 15 support HQV video (320 pixels × 240
lines). The network illustrated in FIG. 1 is
purely for illustrative purposes and many more complex
nodes may be incorporated that are non-video capable
on the same network as the illustrated nodes. Further,
the present invention may be applied to any network
configuration besides the star configuration of FIG. 1
such as token ring networks, branching tree networks,
etc. The fundamental requirement for the network
which has these video-capable nodes is that the nodes
be able to transmit data, including video data from one
point to another and receive acknowledgments from the
[...]

FIG. 2 illustrates typical video encoding hardware to
which the present invention may be applied. This can be
used for preparing video data to be transmitted over a
network of the type illustrated in FIG. 1 to provide
real-time video conferencing. A video camera 20 receives
the video image that is to be encoded and conveyed.
Such cameras are common and work on a number
of technologies such as charge coupled devices, etc.
The video camera may directly include video CODEC
21 or it may be tightly coupled as illustrated in the
figure. The video CODEC 21 receives the electronic
image from the camera and digitizes the image when
being used in its encoder capacity. Video CODECS are
generally known and come in a number of varieties
which may be used for encoding video data to be
transmitted and decoding video data when received. In FIG.
2, the camera output is propagated to the capture buffer
22 of video CODEC 21.

From the capture buffer 22, the video information is
processed by motion estimation circuitry 23. The motion
estimation circuitry is used to generate motion
vectors which describe the difference of a portion of a
video image from the previously recorded image in
terms of a translational offset. The motion estimation
circuitry compares the currently decoded frame from
the previous frame stored in the transmit reference
image buffer 30 about which more will be described
further herein. From the motion estimation circuitry,
the outputs are the motion vectors and the motion
compensated image 24. The motion compensated image 24
is then processed by the differential pulse code modulation
(DPCM) circuitry 25 which generates digital information
of the changes to the previously stored transmit
reference image. Finally, a final stage of coding is done
at transform coding block 26 which also performs
quantization and run-length encoding. Run-length encoding
is a technique for compressing data sequences that have
large numbers of zeros and is well-known to those of
ordinary skill in the art. This transform coder may
perform a discrete cosine transform (DCT).

From the transform coding block, the coded sequence
is propagated to the output buffer 27 which is
used to maintain a constant bit rate for the output to the
network. As was described, prior art methods used the
output buffer fullness to regulate the degree of quantization
that would be applied to the compressing and encoding
circuitry because a constant bandwidth availability
was assumed.

The transform coder block 26 also outputs the compressed
image data to generate a new transfer reference
image for storage in the transmit reference image buffer.
The encoding logic provides the compressed image
data to a decoding block 28 that has an inverse quantizer
and inverse discrete cosine transform decoder
which can be used to combine the decoded image data
with the previously stored transfer reference image to
yield a new transmit reference image which corresponds
to the image that was most recently propagated
on the network. It is this image data that would be used
in calculating the changes in the image in sending the
next frame of information. In other words, the transmit
reference image, which is the same image that will be
reconstructed at the other end by the video decoder, is
used as the basis of subsequent encoding, including
motion vectors and motion compensated image compression.

As was described in the previous section, the prior art
feedback mechanism using the output buffer assumed a
constant bit rate would be available for the transmission
of information. This assumption no longer holds for
video conferencing type devices which are on bursty
networks such as CSMA LAN networks. The solution
proposed by the present invention is to provide feedback
between the video CODEC and the communications
channel such that the characteristics of the channel
are used to drive multiple video output buffers.
These buffers share an original temporal video reference
but will have different subsequent temporal video
images. The communications channel interface then
picks the subsequent video image buffer that best
matches the current condition. By using a predictor of
the channel performance, the video algorithm can be
tuned to provide video output buffers with the best
guess of how the buffers should be configured. Once a
particular output buffer's image data is selected, the
remaining buffers can be flushed to be refilled again
based on a newly calculated transmit reference image.
In the limit, the final action is to revert to an exception
handler similar to current video CODECS, i.e., insert a
key frame to restart the encoding of video data transmission.

FIG. 3 illustrates conceptually the logical multiple
output buffers of the present invention. When the video
camera 20 records an image it is encoded by the encoding
circuitry described above and the encoded information
is propagated to the output buffer 27. In a bursty
network, the network may not be able to receive this
newly calculated image data. Accordingly, the camera
continues to detect images and encode the data and
newly translated data is stored in subsequent output
buffers such as 41, 42 or 43. For example, the information
stored in the output buffer 27 may correspond to
the digital information equivalent to the changes from
the transmit reference image stored in the transmit
reference image buffer 30 at time t=0. In output buffer 41,
the data information may correspond to the difference
between the transfer reference image and 1/15th of a
second later than the data information stored in buffer
27. Likewise, output buffers 42 and 43 may store data
corresponding to the temporal change between the
transmit reference image and the image before the camera
at successively later times.

The video encoder and camera circuitry described
may be incorporated as part of a station that is on the
network and are responsive to information received
over the communications channel. When a given node
again has the bus, the output buffer with the most current
image may be signaled to transmit its information
to the receiving node. Likewise, the channel information
is used to then calculate the next transmit reference
image for storage. The output buffers are then flushed
and are again loaded in a time sequential manner until
the data is again ready to be sent over the network.
While four (4) output buffers are illustrated, this is
purely for illustrative purposes in that as many buffers
may be implemented as computing power and resources
provide.

FIG. 4 illustrates conceptually a branching tree that
is pruned at times T=1, T=2, T=3, etc., for each slice
of information that is taken and propagated on the network.
This conceptualizes the use of multiple output
buffers as a tree which is continually pruned with the
most current pruning corresponding to the present
transfer reference image.

FIG. 5 illustrates another conceptualization of the
present invention. The encoder, through feedback from
the data communications channel, creates several logical
output buffers corresponding to behavioral predictions
based on the feedback from the communications
channel. For example, logical output buffer 1 could
represent the case where more bandwidth will be dynamically
allocated to this natural data compression
over the next unit of time. The unit of time could be an
image frame or, for example, a frame of sampled audio.
In FIG. 5, the various predictions of the bandwidth
available to the compression algorithm are shown
below in Table I.

TABLE I

For video coding, more bandwidth could be used to
get sharper images and/or higher frame rate. The actual
data contained in the logical output buffers can be
significantly different, too. For example, in video coding,
the new transmit reference might be calculated from
different input images in time and/or spatial resolution.
Logical output buffer 1 might represent the data from
an image taken 1/15th of a second later than transmit
reference 0, while logical output buffer 2 might represent
the differential coding from an image half a second
later from transmit reference 0. Such an approach
would be good for video coding for channels where the
bit rate allocated to video may undergo extreme fluctuations
such as in the bursty networks described above.

While with reference to FIGS. 2 and 3, the output
buffers are illustrated as, for example, discrete memory
elements, FIG. 5 makes it clear that logical buffers may
be created in a common block of memory and that the
number of such buffers is limited only by the available
computational power to simultaneously encode them
and the memory to sufficiently handle them. FIG. 6 is
used to illustrate that the present invention is not
necessarily limited to video encoding and illustrates a frame
of audio information. For example, in the G.728 standard
each frame of data is 5 milliseconds long. The
frame may be stored as a transmit reference and subsequent
transmissions may follow the differential coding
principles wherein only the changed information is sent
to the receiving node. The audio encoder may be responsive
to feedback from the network and maintain a
plurality of logical output buffers such as those described
in the video application. One possible application
for such an implementation would be in wireless
telephony wherein portions of an audio transmission
may be lost when a transmitting station goes through a
tunnel. The responding network indicates that its most
recently received information is slightly stale and that a
late change logical output buffer should be used in
providing the encoded differential information.

In a more general description of the present invention,
reference is now made to FIG. 7. Information
about a real-time object 100 that is desired to be conveyed
from a transmitting node to a receiving node on
some sort of network is shown. This real-time object
100 may be a video image or it may be a sound depending
on the particular implementation. A capture mechanism
110 detects the real-time object and encodes it into
electronic information. The capture mechanism may be
a camera for video information as described above or a
microphone or stereo microphones for audio information.
This information is then processed by differential
encoder 115 which compares the newly captured real-time
object to the previously stored recorded object in
transmit reference buffer 120. The differentially encoded
data is then propagated to a logical output buffer
125 which operates as those described above. When the
network clears the output buffers for transmission, the
particular output buffer having the best information
conveys it over the network and that same information
is used to calculate a new transmit reference to be stored
in transmit reference buffer 120.
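
The generalized flow of FIG. 7 can be sketched in software as follows. This is a minimal illustration under stated assumptions rather than the circuitry described: frames are flat lists of sample values, differential encoding is plain subtraction with no motion compensation, transform coding or quantization, the "best" buffer is the one whose reconstruction has the smallest sum of absolute errors against the current frame, and the oldest buffer is dropped when all buffers are full. The class and method names are hypothetical.

```python
class MultiBufferEncoder:
    """Toy model of a differential encoder with several logical output buffers."""

    def __init__(self, reference, num_buffers=4):
        self.reference = list(reference)   # transmit reference (buffer 120)
        self.num_buffers = num_buffers
        self.buffers = []                  # (capture_time, differential data)

    def capture(self, time, frame):
        """Differentially encode a frame against the transmit reference and
        store the result in the next logical output buffer."""
        if len(frame) != len(self.reference):
            raise ValueError("frame size does not match the transmit reference")
        diff = [f - r for f, r in zip(frame, self.reference)]
        if len(self.buffers) == self.num_buffers:
            self.buffers.pop(0)            # assumption: drop the oldest entry
        self.buffers.append((time, diff))

    def transmit(self, current_frame):
        """Called when the network grants access.

        Selects the buffer whose differential data, applied to the transmit
        reference, best approximates the current frame; updates the transmit
        reference from it; flushes all buffers. Returns (time, diff) or None.
        """
        if not self.buffers:
            return None

        def error(entry):
            _, diff = entry
            return sum(abs(r + d - c)
                       for r, d, c in zip(self.reference, diff, current_frame))

        time, diff = min(self.buffers, key=error)
        self.reference = [r + d for r, d in zip(self.reference, diff)]
        self.buffers.clear()
        return time, diff
```

A caller would invoke `capture` for each new frame while the channel is busy and `transmit` once access is granted; because every buffer is encoded against the same transmit reference, whichever one is sent leaves the encoder and the receiving node holding the same reference.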




There has thus been described a method and appara- 
tus of differential coding for use in bursty transmission 
networks which greatly improves the quality of trans- 
mitted compressed information. Although the present 
invention has been described in terms of preferred em- 
bodiments, it will be appreciated that various modifica- 
tions and alterations might be made by those skilled in 
the art without departing from the spirit and scope of 
the invention. The invention should, therefore, be mea- 
sured in terms of the claims which follow. 

What is claimed is: 

1. For use in a communications network having a 
plurality of nodes wherein a node may encode real-time 
information for propagating over said network, a 
method of processing said real-time information com- 
prising the steps of: 

providing said node with a plurality of output buffers; 

(a) electronically capturing said real-time information 
and converting it into electronic data; 

(b) differentially encoding said electronic data using a 
previously stored transmit reference image as a 
base to produce differential data; 

(c) storing said differential data in one of said plural- 
ity of output buffers; 

(d) monitoring said network for access to propagate 
said differential data; 

repeating steps (a)-(d) until said node may propagate 
said differential data over said network; 

transmitting data over said network from the one of 
said plurality of output buffers providing a best 
differential data to a receiving node on said net- 
work, wherein said best differential data represents 
a differential data whose use in conjunction with 
the previously stored transmit reference image 
produces an image that approximates a current 
frame better than use of other differential data 
contained in said plurality of output buffers; and 

calculating a new transmit reference image based on
said best differential data and said previously
stored transmit reference image.

***** 